Wire tier-2a Langfuse trace-shape fixtures #185
Merged
Move six fixtures (022/031/032 Langfuse trace + observation tree, 035/036 caller-invocation-id derivation, 059 implementation attribution) from _UNIT_TESTED_FIXTURES into _SUPPORTED_FIXTURES, driven through a LangfuseObserver + InMemoryLangfuseClient recorder. Second tier of the fixture-harness catch-up; test-only, no library change, no pin bump. Adds a Langfuse-trace runner plus a value-matcher for the placeholder tokens (<uuid-hex>, <any-string>, <corr_id_N> first-occurrence binding) and the assertion sub-key matchers (harness_parameterized, non_empty_string), and an invocation-id runner. The fixture trace.id is the derived Langfuse id, so the harness bridges the recorder's raw invocation_id through the impl's own langfuse_trace_id. No deferrals; 023/024 (Langfuse Generation) are tier 2b.
Pull request overview
Wires the tier-2a Langfuse “trace-shape” conformance fixtures into the YAML observability harness by adding Langfuse-specific runners and matchers so these fixtures can be executed end-to-end via LangfuseObserver + InMemoryLangfuseClient.
Changes:
- Adds YAML-harness drivers for Langfuse trace-shape fixtures (022/031/032/059) and invocation-id derivation fixtures (035/036).
- Introduces a Langfuse value-matcher to support placeholder tokens and assertion sub-key matchers used by these fixtures.
- Extends _assert_langfuse_observation_tree to optionally use the matcher path while preserving exact-match behavior for existing tool fixtures.
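The placeholder handling described above can be illustrated with a minimal sketch. It is not the harness's actual matcher: the function name `match_value` is hypothetical, and the sketch assumes `<uuid-hex>` means 32 lowercase hex characters and that `<corr_id_N>` binds on first occurrence and must compare equal afterwards. The assertion sub-key matchers (harness_parameterized, non_empty_string) are left out because their semantics are not spelled out here.

```python
import re

# Assumption: <uuid-hex> stands for a 32-character lowercase hex string.
_UUID_HEX = re.compile(r"[0-9a-f]{32}")
_CORR_ID = re.compile(r"<corr_id_\d+>")


def match_value(expected, actual, bindings):
    """Return True if `actual` satisfies `expected`, honoring placeholder tokens.

    `bindings` is a mutable dict shared across one fixture run; it records the
    first value seen for each <corr_id_N> token.
    """
    if expected == "<uuid-hex>":
        return isinstance(actual, str) and _UUID_HEX.fullmatch(actual) is not None
    if expected == "<any-string>":
        return isinstance(actual, str)
    if isinstance(expected, str) and _CORR_ID.fullmatch(expected):
        if expected not in bindings:
            # First occurrence: bind the token to whatever value was recorded.
            bindings[expected] = actual
            return True
        # Later occurrences must agree with the bound value.
        return bindings[expected] == actual
    if isinstance(expected, dict):
        return (
            isinstance(actual, dict)
            and expected.keys() == actual.keys()
            and all(match_value(v, actual[k], bindings) for k, v in expected.items())
        )
    if isinstance(expected, list):
        return (
            isinstance(actual, list)
            and len(expected) == len(actual)
            and all(match_value(e, a, bindings) for e, a in zip(expected, actual))
        )
    # Anything else falls back to exact match.
    return expected == actual
```

Sharing one `bindings` dict across the whole observation tree is what turns `<corr_id_N>` into a consistency check rather than a wildcard.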
Address review feedback on the tier-2a fixture wiring:
- _run_langfuse_trace_case wraps invoke/drain in try/finally so the observer is always shut down, even if the graph raises.
- _run_invocation_id_case now holds the LangfuseObserver reference and shuts it down in finally, matching the other Langfuse runners.
- _assert_langfuse_observation_tree enables the value-matcher only when both bindings and params are provided (and, not or), so a partial call degrades to exact match instead of half-enabling the matcher.
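The two patterns in this review fix can be sketched in isolation. Everything below is illustrative: `FakeObserver`, its `shutdown` method, `run_case`, and `matcher_enabled` are hypothetical stand-ins, not the harness's or LangfuseObserver's real API.

```python
class FakeObserver:
    """Hypothetical stand-in for an observer that must be shut down."""

    def __init__(self):
        self.shut_down = False

    def shutdown(self):
        self.shut_down = True


def run_case(observer, invoke):
    """Run `invoke` and shut the observer down even if it raises."""
    try:
        return invoke()
    finally:
        observer.shutdown()


def matcher_enabled(bindings, params):
    """Enable the value-matcher only when BOTH arguments are supplied.

    With `or`, a call passing just one of them would half-enable the matcher;
    with `and`, a partial call degrades to plain exact matching.
    """
    return bindings is not None and params is not None
```

The `finally` clause runs on both the normal and the exceptional path, so the shutdown call does not need to be repeated in an `except` block.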
Summary
Second tier of the conformance-harness fixture catch-up: wire the trace-shape
Langfuse observability fixtures into the YAML harness, driven through a
LangfuseObserver backed by the in-memory recorder. Test-only, no library
change, no pin bump. (The two Langfuse Generation fixtures, 023/024, follow in
tier 2b.)
Wired (6)
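Moved from _UNIT_TESTED_FIXTURES to _SUPPORTED_FIXTURES:
- 022/031/032: Langfuse trace + observation tree (… subgraph hierarchy, fan-out per-instance)
- 035/036: caller-invocation-id derivation (… dashes-stripped, and sha256 for a non-UUID)
- 059: implementation attribution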
Harness machinery added
- _run_langfuse_trace_fixture: build a graph via the adapter, record into an InMemoryLangfuseClient, and assert the Trace id / name / metadata plus the observation tree.
- A value-matcher for the placeholder tokens (<uuid-hex>, <any-string>, and <corr_id_N> first-occurrence binding for the correlation-id-consistency check) and the assertion sub-key matchers (harness_parameterized, non_empty_string).
- _assert_langfuse_observation_tree gains an opt-in matcher path; the existing tool-fixture caller keeps its exact-match behavior.
- _run_invocation_id_fixture for 035/036.

A note on trace.id
The fixtures assert the derived Langfuse trace id, while the in-memory recorder
keys traces by the raw OpenArmature invocation_id (the real SDK adapter derives
the OTel id via _to_otel_trace_id). The harness bridges by running the raw id
through the implementation's own langfuse_trace_id, which is exactly that
helper's purpose. The in-memory double storing the raw id rather than the
derived one is a minor fidelity gap in that public testing utility, tracked as a
separate follow-up.
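The bridging step can be sketched as follows. This is an illustration, not the implementation's langfuse_trace_id: the names `derive_trace_id` and `rekey_by_derived_id` are hypothetical, and the sketch assumes a UUID-shaped invocation id maps to its dashes-stripped hex form while any other string maps to the first 32 hex characters of its SHA-256 digest. The truncation length is an assumption chosen to match a 128-bit trace id.

```python
import hashlib
import uuid


def derive_trace_id(invocation_id: str) -> str:
    """Hypothetical derivation of a 32-hex trace id from a raw invocation id."""
    try:
        # UUID-shaped input: keep the value, drop the dashes.
        return uuid.UUID(invocation_id).hex
    except ValueError:
        # Assumption: non-UUID input is hashed and truncated to 128 bits.
        digest = hashlib.sha256(invocation_id.encode("utf-8")).hexdigest()
        return digest[:32]


def rekey_by_derived_id(recorded_by_raw_id: dict) -> dict:
    """Re-key traces stored under raw invocation ids by their derived ids."""
    return {
        derive_trace_id(raw_id): trace
        for raw_id, trace in recorded_by_raw_id.items()
    }
```

With a mapping like this, a fixture that asserts the derived trace.id can be compared against a recorder that stores traces under the raw id.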
Testing
tests/conformance/test_observability.py: 70 passed, 42 skipped.
tests/: 1462 passed, 408 skipped.